Orchard Park
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Indiana (0.06)
- North America > United States > Michigan (0.05)
- Media > News (1.00)
- Leisure & Entertainment > Sports > Football (1.00)
MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability
Wu, Weiqi, Guan, Xin, Huang, Shen, Jiang, Yong, Xie, Pengjun, Huang, Fei, Cao, Jiuxin, Zhao, Hai, Zhou, Jingren
Retrieval-Augmented Language Models (RALMs) represent a classic paradigm where models enhance generative capabilities using external knowledge retrieved via a specialized module. Recent advancements in agent techniques enable Large Language Models (LLMs) to autonomously utilize tools for retrieval, planning, and reasoning. While existing training-based methods show promise, their agentic abilities are limited by inherent characteristics of the task-specific data used during training. To further enhance the universal search capability of agents, we propose a novel pre-training framework, MaskSearch. In the pre-training stage, we introduce the Retrieval Augmented Mask Prediction (RAMP) task, where the model learns to leverage search tools to fill masked spans across a large amount of pre-training data, thus acquiring universal retrieval and reasoning capabilities. After that, the model is trained on downstream tasks to achieve further improvement. We apply both Supervised Fine-tuning (SFT) and Reinforcement Learning (RL) for training. For SFT, we combine agent-based and distillation-based methods to generate training data, starting with a multi-agent system consisting of a planner, a rewriter, and an observer, followed by a self-evolving teacher model. For RL, we employ DAPO as the training framework and adopt a hybrid reward system consisting of answer rewards and format rewards. Additionally, we introduce a curriculum learning approach that allows the model to learn progressively from easier to more challenging instances based on the number of masked spans. We evaluate the effectiveness of our framework in the scenario of open-domain multi-hop question answering. Through extensive experiments, we demonstrate that MaskSearch significantly enhances the performance of LLM-based search agents on both in-domain and out-of-domain downstream tasks.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Kentucky (0.05)
- North America > United States > Illinois (0.04)
- Research Report (1.00)
- Workflow (0.94)
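The MaskSearch abstract above mentions two mechanisms that a small example may make more concrete: a curriculum ordered by the number of masked spans, and a hybrid reward combining an answer reward with a format reward. The Python sketch below is illustrative only and is not the authors' implementation. The `[mask]` token, the `<answer>` tag convention, the substring-based answer match, and the 0.1 format weight are all assumptions chosen for the example; the abstract does not specify any of them.

```python
from dataclasses import dataclass

MASK = "[mask]"  # hypothetical mask token, not taken from the paper


@dataclass
class Example:
    text: str      # passage with some spans replaced by MASK
    answers: list  # ground-truth strings for the masked spans, in order


def count_masks(example):
    """Number of masked spans, used here as the difficulty measure."""
    return example.text.count(MASK)


def curriculum_order(examples):
    """Sort examples from fewest to most masked spans (easy to hard)."""
    return sorted(examples, key=count_masks)


def format_reward(response):
    """1.0 if the response contains <answer> before </answer>, else 0.0.

    The tag convention is an assumption made for this sketch.
    """
    start = response.find("<answer>")
    end = response.find("</answer>")
    return 1.0 if 0 <= start < end else 0.0


def answer_reward(response, answers):
    """Fraction of ground-truth spans found in the response.

    Case-insensitive substring matching is a simple stand-in; the
    abstract does not say how answer rewards are computed.
    """
    if not answers:
        return 0.0
    lowered = response.lower()
    hits = sum(1 for a in answers if a.lower() in lowered)
    return hits / len(answers)


def hybrid_reward(response, answers, format_weight=0.1):
    """Weighted sum of the two rewards; the weight is an assumed value."""
    return (format_weight * format_reward(response)
            + (1 - format_weight) * answer_reward(response, answers))
```

For instance, `curriculum_order` would place a passage with one `[mask]` before a passage with three, and `hybrid_reward` would give partial credit to a well-formatted response that recovers only some of the masked spans.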
'Never, ever try to shoot at a drone.' Neighborhoods buzz with complaints over pesky drones
Here's what to do when a drone gets too close for comfort. Sam Sargent uses a DJI drone to get aerial photographs of a home for sale in the Crocker Highlands neighborhood of Oakland. For months, users of the local social network Nextdoor here have been stewing about these small, flying vehicles, which often carry cameras, accusing them of snooping or maybe casing the joint. They wonder if it's legal to fight back, say by lassoing the pesky vehicle flying outside their window – or even shooting it down with a potato gun. Oakland resident Katy O'Neill goes as far as blaming a drone for shattering her dining room window.
- South America > Venezuela (0.05)
- North America > United States > New York > Erie County > Orchard Park (0.05)
- North America > United States > New Jersey (0.05)
- Information Technology (1.00)
- Transportation > Air (0.99)
- Government > Regional Government > North America Government > United States Government (0.32)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
- Information Technology > Communications > Social Media (0.91)